library(skimr)
library(tidyverse)
library(caret) # For featurePlot, classification report
library(corrplot) # For correlation matrix and PCA contribution plots
library(AppliedPredictiveModeling)
library(mice) # For data imputation
library(VIM) # For missing data visualization
library(gridExtra) # For grid plots
library(pROC) # For AUC calculations
library(ROCR) # For ROC and AUC plots
library(dendextend) # For dendrograms
library(factoextra) # For PCA plots
library(e1071) # For SVM
This dataset is composed of real patient responses to two questionnaires, one related to ADHD and one to Mood Disorder, along with a variety of demographic, abuse and drug use variables. For each questionnaire, the responses to individual questions are provided along with total scores. Links to the actual questions are provided below:
The first part of this work will make use of unsupervised learning techniques such as Principal Component Analysis (PCA) and clustering in an attempt to discover structures in the data. The second part will explore support vector machines in a supervised learning exercise to predict whether an individual has attempted suicide.
The dataset is composed of 54 variables and 175 observations. The data is coded as numeric and holds 33 observations that have some level of missing data. A summary of the variable distributions is provided below:
adhd_raw <- read.csv('https://raw.githubusercontent.com/maelillien/data622/main/hw4/adhd_data.csv', header = TRUE)
adhd <- adhd_raw
head(adhd)
| Name | adhd %>% select(-c(ADHD.Q… |
| Number of rows | 175 |
| Number of columns | 21 |
| _______________________ | |
| Column type frequency: | |
| factor | 1 |
| numeric | 20 |
| ________________________ | |
| Group variables | None |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| Initial | 0 | 1 | FALSE | 109 | DB: 5, CM: 4, DJ: 4, JM: 4 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Age | 0 | 1.00 | 39.47 | 11.17 | 18 | 29.5 | 42 | 48.0 | 69 | ▆▅▇▅▁ |
| Sex | 0 | 1.00 | 1.43 | 0.50 | 1 | 1.0 | 1 | 2.0 | 2 | ▇▁▁▁▆ |
| Race | 0 | 1.00 | 1.64 | 0.69 | 1 | 1.0 | 2 | 2.0 | 6 | ▇▁▁▁▁ |
| ADHD.Total | 0 | 1.00 | 34.32 | 16.70 | 0 | 21.0 | 33 | 47.5 | 72 | ▃▆▇▆▂ |
| MD.TOTAL | 0 | 1.00 | 10.02 | 4.81 | 0 | 6.5 | 11 | 14.0 | 17 | ▃▃▆▇▇ |
| Alcohol | 4 | 0.98 | 1.35 | 1.39 | 0 | 0.0 | 1 | 3.0 | 3 | ▇▂▁▁▆ |
| THC | 4 | 0.98 | 0.81 | 1.27 | 0 | 0.0 | 0 | 1.5 | 3 | ▇▁▁▁▃ |
| Cocaine | 4 | 0.98 | 1.09 | 1.39 | 0 | 0.0 | 0 | 3.0 | 3 | ▇▁▁▁▅ |
| Stimulants | 4 | 0.98 | 0.12 | 0.53 | 0 | 0.0 | 0 | 0.0 | 3 | ▇▁▁▁▁ |
| Sedative.hypnotics | 4 | 0.98 | 0.12 | 0.54 | 0 | 0.0 | 0 | 0.0 | 3 | ▇▁▁▁▁ |
| Opioids | 4 | 0.98 | 0.39 | 0.99 | 0 | 0.0 | 0 | 0.0 | 3 | ▇▁▁▁▁ |
| Court.order | 5 | 0.97 | 0.09 | 0.28 | 0 | 0.0 | 0 | 0.0 | 1 | ▇▁▁▁▁ |
| Education | 9 | 0.95 | 11.90 | 2.17 | 6 | 11.0 | 12 | 13.0 | 19 | ▁▅▇▂▁ |
| Hx.of.Violence | 11 | 0.94 | 0.24 | 0.43 | 0 | 0.0 | 0 | 0.0 | 1 | ▇▁▁▁▂ |
| Disorderly.Conduct | 11 | 0.94 | 0.73 | 0.45 | 0 | 0.0 | 1 | 1.0 | 1 | ▃▁▁▁▇ |
| Suicide | 13 | 0.93 | 0.30 | 0.46 | 0 | 0.0 | 0 | 1.0 | 1 | ▇▁▁▁▃ |
| Abuse | 14 | 0.92 | 1.33 | 2.12 | 0 | 0.0 | 0 | 2.0 | 7 | ▇▂▁▁▁ |
| Non.subst.Dx | 22 | 0.87 | 0.44 | 0.68 | 0 | 0.0 | 0 | 1.0 | 2 | ▇▁▃▁▁ |
| Subst.Dx | 23 | 0.87 | 1.14 | 0.93 | 0 | 0.0 | 1 | 2.0 | 3 | ▆▇▁▅▂ |
| Psych.meds. | 118 | 0.33 | 0.96 | 0.80 | 0 | 0.0 | 1 | 2.0 | 2 | ▇▁▇▁▆ |
The dataset is modified to include an EducationLevel categorical variable derived from the numerical Education variable, which represents the years of schooling. The Abuse column is unfolded into 3 binary variables indicating the occurrence of each of the 3 types of abuse. The original Abuse variable is dropped.
We work with multiple subsets of the data in subsequent parts of this report. Some analyses make use of the entire set of questionnaire responses while others use only the total scores.
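The recoding described above can be sketched on a small made-up data frame. The abuse code book used here (1 = physical, 2 = sexual, 3 = emotional, 4 to 7 = combinations) is an assumption read off the comparisons in the report's own code, and the helper `flag()` is a hypothetical name introduced only for this illustration.

```r
# Toy stand-in for the Education and Abuse columns (values are made up).
toy <- data.frame(
  Education = c(6, 8, 9, 12, 13, 19, NA),
  Abuse     = c(0, 1, 2, 3, 4, 7, NA)
)

# Bin years of schooling: (-Inf, 8] -> 1, (8, 12] -> 2, (12, Inf] -> 3.
# cut() keeps NA as NA, so no sentinel value is needed.
toy$EducationLevel <- cut(toy$Education, breaks = c(-Inf, 8, 12, Inf), labels = FALSE)

# %in% returns FALSE for NA, so missing values are restored explicitly.
# flag() is a hypothetical helper, not part of the original analysis.
flag <- function(x, codes) ifelse(is.na(x), NA_real_, as.numeric(x %in% codes))

toy$AbuseP <- flag(toy$Abuse, c(1, 4, 5, 7))  # physical (assumed coding)
toy$AbuseS <- flag(toy$Abuse, c(2, 4, 6, 7))  # sexual (assumed coding)
toy$AbuseE <- flag(toy$Abuse, c(3, 5, 6, 7))  # emotional (assumed coding)

print(toy)
```

Compared with chained `ifelse()` and `|` comparisons, `cut()` and `%in%` keep the bin edges and code sets in one place, which makes the coding easier to audit.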
# Renaming variables
adhd <- rename(adhd, ADHDTotal = ADHD.Total, MDTotal = MD.TOTAL, Sedatives = Sedative.hypnotics, CourtOrder = Court.order,
Violence = Hx.of.Violence, Conduct = Disorderly.Conduct, NonSubstDX = Non.subst.Dx, SubstDX = Subst.Dx, PsychMeds = Psych.meds.)
# Drop Initial column
adhd <- adhd %>% select(-c(Initial))
# Shift sex variable to 0 and 1
adhd$Sex <- adhd$Sex-1
# Re-coding Education Variable, College = Education > 12, HS = Education > 8 & <= 12, MS = Education <= 8
adhd$EducationLevel <- ifelse(adhd$Education <= 8, 1, ifelse(adhd$Education <= 12 & adhd$Education > 8, 2, ifelse(adhd$Education > 12, 3, 99)))
# Creating new variables based on type of abuse
adhd$AbuseP <- as.numeric(adhd$Abuse == 1 | adhd$Abuse == 4 | adhd$Abuse == 5 | adhd$Abuse == 7)
adhd$AbuseS <- as.numeric(adhd$Abuse == 2 | adhd$Abuse == 4 | adhd$Abuse == 6 | adhd$Abuse == 7)
adhd$AbuseE <- as.numeric(adhd$Abuse == 3 | adhd$Abuse == 5 | adhd$Abuse == 6 | adhd$Abuse == 7)
adhd <- adhd %>% select(-c(Abuse))
# Forming data subsets: full set of variables or reduced (totals only)
adhd.full <- adhd
adhd.red <- adhd %>% select(c(Age, Sex, Race, ADHDTotal, MDTotal, Alcohol:AbuseE))
# Set all characters to numeric
adhd.red <- mutate_all(adhd.red, function(x) as.numeric(as.character(x)))
adhd.full <- mutate_all(adhd.full, function(x) as.numeric(as.character(x)))
The dataset contains a few missing values. The PsychMeds variable is missing for about 67% of observations and was dropped entirely. A few observations were quite sparse and only contained basic demographic and questionnaire score columns. To avoid biasing the dataset with imputed values, we chose to drop all observations with missing values. The resulting dataset contains 33 fewer observations with 142 complete rows and 19 columns.
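As a minimal base-R sketch of the same idea (share of missing values per column, then dropping sparse columns and incomplete rows), using a made-up data frame. The 50% cutoff is an assumption chosen for the illustration, not a threshold stated in the report.

```r
# Toy data with one mostly-missing column (values are made up).
toy <- data.frame(
  Age       = c(25, 40, 31, 58, 47),
  Suicide   = c(0, NA, 1, 0, 0),
  PsychMeds = c(NA, NA, 1, NA, NA)
)

# Share of missing values per column, largest first.
miss_share <- colMeans(is.na(toy))
print(sort(miss_share, decreasing = TRUE))

# Drop columns above an (assumed) 50% missingness cutoff,
# then keep only rows that are complete on the remaining columns.
keep <- names(toy)[miss_share <= 0.5]
toy_complete <- toy[complete.cases(toy[, keep, drop = FALSE]), keep, drop = FALSE]

dim(toy_complete)
```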
##
## Variables sorted by number of missings:
## Variable Count
## PsychMeds 0.67428571
## SubstDX 0.13142857
## NonSubstDX 0.12571429
## AbuseP 0.08000000
## AbuseS 0.08000000
## AbuseE 0.08000000
## Suicide 0.07428571
## Violence 0.06285714
## Conduct 0.06285714
## Education 0.05142857
## EducationLevel 0.05142857
## CourtOrder 0.02857143
## Alcohol 0.02285714
## THC 0.02285714
## Cocaine 0.02285714
## Stimulants 0.02285714
## Sedatives 0.02285714
## Opioids 0.02285714
## Age 0.00000000
## Sex 0.00000000
## Race 0.00000000
## ADHDTotal 0.00000000
## MDTotal 0.00000000
# Discard PsychMeds
adhd.red.complete <- adhd.red %>% select(-c(PsychMeds))
adhd.full.complete <- adhd.full %>% select(-c(PsychMeds))
# Keep only complete cases
adhd.red.complete <- adhd.red.complete[complete.cases(adhd.red.complete),]
adhd.full.complete <- adhd.full.complete[complete.cases(adhd.full.complete),]
Clustering refers to a broad set of techniques for finding subgroups, or clusters, in a dataset. We seek to partition observations into distinct groups so that the observations within each group are quite similar to each other, while observations in different groups are quite different from each other. The most popular clustering approaches are K-means and Hierarchical Clustering (HC). While the former requires a pre-specified number of clusters k, the latter does not. HC is a bottom-up or agglomerative clustering approach which results in an upside-down tree representation, built from the leaves and combined into clusters up to the trunk. Clusters are identified by horizontal cuts across the dendrogram.
In this section, we explore the use of Hierarchical Clustering on two portions of the data. The first uses only the questionnaire responses to ADHD while the second uses the total questionnaire scores for both surveys as well as the other variables (demographic, drugs, abuse, etc). The latter is referred to as the ‘reduced’ dataset.
Clustering typically requires the variables to be scaled in order to avoid giving more weight to variables with a larger range of values. However, when all the variables under consideration are measured on the same scale, which is the case when only comparing survey responses, it can be appropriate to leave the variables unscaled.
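The effect of scaling on Euclidean distance can be seen on a tiny made-up example with one wide-range variable (Age) and one binary variable (Sex). The values are invented for illustration only.

```r
# Four made-up patients: Age spans decades, Sex is 0/1.
toy <- data.frame(
  Age = c(30, 35, 30, 50),
  Sex = c(0, 0, 1, 1)
)

d_raw    <- as.matrix(dist(toy))         # distances on the original units
d_scaled <- as.matrix(dist(scale(toy)))  # distances after centering and scaling

# Patient 2 differs from patient 1 by 5 years of age;
# patient 3 differs from patient 1 only in Sex.
round(c(raw_1_2 = d_raw[1, 2], raw_1_3 = d_raw[1, 3]), 2)
round(c(scaled_1_2 = d_scaled[1, 2], scaled_1_3 = d_scaled[1, 3]), 2)
```

On the raw data the difference in Sex barely registers next to a few years of Age, while after scaling the ordering of the two distances flips.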
With HC, the concept of dissimilarity between a pair of observations needs to be extended to a pair of groups of observations. This extension is achieved with the notion of linkage, which defines the dissimilarity between two groups of observations. The resulting dendrogram heavily depends on the choice of linkage. The most popular linkages are complete and average because they tend to result in more balanced clusters.
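To illustrate how the linkage choice changes cluster balance, the sketch below cuts the same simulated data into three clusters under three linkages and compares cluster sizes. The data are random draws, not the study data, so the exact sizes carry no meaning beyond the comparison.

```r
set.seed(1)

# Simulated 2-D data: three groups of 20, 20 and 10 points.
toy <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 4), ncol = 2),
  matrix(rnorm(20, mean = 8), ncol = 2)
)
d <- dist(toy, method = "euclidean")

# Cluster sizes after cutting each tree into k = 3 groups.
sizes <- vapply(
  c(single = "single", complete = "complete", average = "average"),
  function(m) tabulate(cutree(hclust(d, method = m), k = 3), nbins = 3),
  integer(3)
)
rownames(sizes) <- paste0("cluster", 1:3)
print(sizes)
```

Single linkage is included only as a contrast, since it is known for chaining observations into one large cluster.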
Using only the individual unscaled responses to the ADHD Questionnaire, we obtain the following dendrogram structure using complete linkage. In this case, complete linkage provided the best balancing and a cutoff into 3 clusters looked appropriate. In order to gain insight into these clusters, we need to look at the distribution of the variables within each of them.
For these 3 clusters, we can make the following observations:
We can establish a ranking for this clustering based on the monotonic rise in meanADHD and meanMD across clusters. From least to most severe: Cluster 3, Cluster 2, Cluster 1. With this information, a treatment plan could be developed to focus on clusters 2 and 1, with the latter requiring special attention.
hc0 <- adhd.full %>% select(ADHD.Q1:ADHD.Q18) %>% dist(method = "euclidean") %>% hclust(method = "complete")
dend0 <- hc0 %>% as.dendrogram
sub_grp0 <- cutree(dend0, k = 3, order_clusters_as_data = TRUE)
dend0 %>% set("branches_k_color", k = 3) %>% set("labels", "") %>% plot(main = "Hierarchical Clustering ADHD Questionnaire")
adhd.full %>%
mutate(cluster = sub_grp0) %>%
group_by(cluster) %>%
summarise(meanAge = mean(Age), meanMD = mean(MDTotal), meanADHD = mean(ADHDTotal), count = n()) %>%
gather(var,value,meanAge:count) %>%
ggplot(aes(cluster,value,fill=cluster)) +
geom_col() + facet_grid(var ~ ., scales="free_y") +
geom_text(aes(label=round(value,1)), vjust=1.6, color="white", size=3.5) +
ggtitle('ADHD Questionnaire Cluster Distribution') +
theme_minimal()
For a contrasting analysis, we looked at clustering based on the dataset which included only the total questionnaire scores and dropped observations with missing values. The variables were scaled to balance out the contribution of the high values for scores and age. We explore the use of complete, average and Ward linkages.
Unlike the previous clustering, adding the other variables results in a less balanced dendrogram. In this case, there are a few clusters containing only a few observations and two large clusters containing the majority of the data. We can make a few additional observations:
Based on the variable distributions, it might be tempting to group clusters 4, 5 and 6 together since they represent a younger cohort with similar meanMD and meanADHD scores. However, hierarchical clustering considers more variables than the aforementioned few, and this kind of grouping would violate the measure of similarity as determined by the y axis of the dendrogram.
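The height at which two clusters join on the dendrogram's y axis can be read off numerically with `cophenetic()`, which is one way to check whether merging particular clusters respects the tree. The sketch below uses simulated data; `merge_height()` is a hypothetical helper name introduced here.

```r
set.seed(2)

# Simulated data: two nearby groups and one distant group.
toy <- rbind(
  matrix(rnorm(20, mean = 0),  ncol = 2),
  matrix(rnorm(20, mean = 3),  ncol = 2),
  matrix(rnorm(20, mean = 10), ncol = 2)
)
hc  <- hclust(dist(toy), method = "complete")
grp <- cutree(hc, k = 3)

# Cophenetic distance = height of the merge that first joins two observations.
coph <- as.matrix(cophenetic(hc))

# For two clusters produced by a cut, every cross-cluster pair shares one merge height,
# so the min and max below coincide.
merge_height <- function(a, b) range(coph[grp == a, grp == b])
pairs   <- combn(3, 2)
heights <- apply(pairs, 2, function(p) merge_height(p[1], p[2]))
dimnames(heights) <- list(c("min", "max"), paste(pairs[1, ], pairs[2, ], sep = "-"))
print(round(heights, 2))
```

Clusters that look alike on a handful of summary variables may still join only near the top of the tree, which is what the merge heights make explicit.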
A proposed ADHD and Mood Disorder ranking in 4 levels from least severe to most severe is: Cluster 2, Clusters 3 (younger) & 4 (older), Clusters 5 (younger) & 1 (older), Cluster 6.
hc1 <- adhd.red.complete %>% scale %>% dist(method = "euclidean") %>% hclust(method = "complete")
dend1 <- hc1 %>% as.dendrogram
sub_grp1 <- cutree(dend1, k = 6)
dend1 %>% set("branches_k_color", k = 6) %>% set("labels", "") %>% plot(main = "Hierarchical Clustering Reduced Dataset + Complete Linkage")
adhd.red.complete %>%
mutate(cluster = sub_grp1) %>%
group_by(cluster) %>%
summarise(meanAge = mean(Age), meanMD = mean(MDTotal), meanADHD = mean(ADHDTotal), count = n()) %>%
gather(var,value,meanAge:count) %>%
ggplot(aes(cluster,value,fill=cluster)) +
geom_col() + facet_grid(var ~ ., scales="free_y") +
geom_text(aes(label=round(value,1)), vjust=1.6, color="white", size=3.5) +
ggtitle('Reduced Dataset + Complete Linkage Cluster Distribution') +
theme_minimal()
Average linkage completely changes the dendrogram representation. The same number of clusters, k=6, seemed appropriate. As with complete linkage, one large cluster is identified, this time containing the vast majority of the observations. This obscures the analysis beyond what we can establish about the small clusters.
hc2 <- adhd.red.complete %>% scale %>% dist(method = "euclidean") %>% hclust(method = 'average')
dend2 <- hc2 %>% as.dendrogram
sub_grp2 <- cutree(dend2, k = 6)
dend2 %>% set("branches_k_color", k = 6) %>% set("labels", "") %>% plot(main = "Hierarchical Clustering Reduced Dataset + Average Linkage")
adhd.red.complete %>%
mutate(cluster = sub_grp2) %>%
group_by(cluster) %>%
summarise(meanAge = mean(Age), meanMD = mean(MDTotal), meanADHD = mean(ADHDTotal), count = n()) %>%
gather(var,value,meanAge:count) %>%
ggplot(aes(cluster,value,fill=cluster)) +
geom_col() + facet_grid(var ~ ., scales="free_y") +
geom_text(aes(label=round(value,1)), vjust=1.6, color="white", size=3.5) +
ggtitle('Reduced Dataset + Average Linkage Cluster Distribution') +
theme_minimal()
While average linkage is a popular option, the resulting clustering above was somewhat unattractive. Here we consider another linkage method to expand the analysis. Ward linkage makes use of Ward’s minimum variance criterion, which seeks to minimize the total within-cluster variance. The resulting dendrogram is visually attractive and more balanced than before. A cutoff at k=5 clusters seems appropriate. While more balanced, the insights drawn from the variable distributions across the clusters are not as obvious as when using complete linkage. It is harder to determine the least and most severe clusters as the grouping with the lower meanMD score also has the highest meanADHD score. Clusters 1 and 3 have similar questionnaire scores while differing by more than 10 years on average.
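One detail that may matter when reproducing a Ward dendrogram in R: according to the `hclust` documentation, `"ward.D2"` implements Ward's criterion on unsquared Euclidean distances, while `"ward.D"` does so only when given squared distances. The sketch below checks that relationship on simulated data. It is an illustration of the two options, not a statement about which one the analysis should have used.

```r
set.seed(3)

# Simulated, scaled data (not the study data).
toy <- scale(matrix(rnorm(60), ncol = 3))
d   <- dist(toy, method = "euclidean")

hc_d   <- hclust(d,   method = "ward.D")   # ward.D on unsquared distances
hc_d2  <- hclust(d,   method = "ward.D2")  # Ward's criterion on unsquared distances
hc_dsq <- hclust(d^2, method = "ward.D")   # ward.D on squared distances

# ward.D on squared distances should match ward.D2 up to a square root on the heights.
same_heights <- isTRUE(all.equal(sqrt(hc_dsq$height), hc_d2$height))
same_groups  <- identical(cutree(hc_dsq, k = 4), cutree(hc_d2, k = 4))
c(same_heights = same_heights, same_groups = same_groups)

# Cross-tabulate the two options on the same unsquared distances.
table(ward.D = cutree(hc_d, k = 4), ward.D2 = cutree(hc_d2, k = 4))
```

If the cross-tabulation on the real data shows observations moving between clusters, it may be worth rerunning the Ward analysis with `"ward.D2"` as a sensitivity check.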
hc3 <- adhd.red.complete %>% scale %>% dist(method = "euclidean") %>% hclust(method = 'ward.D')
dend3 <- hc3 %>% as.dendrogram
sub_grp3 <- cutree(dend3, k = 5)
dend3 %>% set("branches_k_color", k = 5) %>% set("labels", "") %>% plot(main = "Hierarchical Clustering Reduced Dataset + Ward Linkage")
adhd.red.complete %>%
mutate(cluster = sub_grp3) %>%
group_by(cluster) %>%
summarise(meanAge = mean(Age), meanMD = mean(MDTotal), meanADHD = mean(ADHDTotal), count = n()) %>%
gather(var,value,meanAge:count) %>%
ggplot(aes(cluster,value,fill=cluster)) +
geom_col() + facet_grid(var ~ ., scales="free_y") +
geom_text(aes(label=round(value,1)), vjust=1.6, color="white", size=3.5) +
ggtitle('Reduced Dataset + Ward Linkage Cluster Distribution') +
theme_minimal()
We use a tanglegram to obtain side-by-side comparisons of the clusters obtained using the different linkage methods. Here we only compare the complete and average linkage clusterings, which are similar in terms of cluster imbalance. The first thing to notice is the consistency of groupings at the higher ends of the hierarchies. This is shown by the horizontal ribbons linking the two clusterings and indicates that the selected observations end up in the corresponding cluster in each representation. Naturally, a large number of observations find correspondence in the opposing largest clusters. A few observations cross the tanglegram to the opposite corner from smaller clusters. Further study of these patients might be of interest.
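A numeric complement to the visual comparison is a cross-tabulation of the two cluster assignments, which shows how many observations of each complete-linkage cluster land in each average-linkage cluster. The sketch below uses simulated data and k = 4 purely for illustration.

```r
set.seed(4)

# Simulated, scaled data: 40 observations in 3 dimensions.
toy <- scale(rbind(
  matrix(rnorm(60, mean = 0), ncol = 3),
  matrix(rnorm(60, mean = 3), ncol = 3)
))
d <- dist(toy)

grp_complete <- cutree(hclust(d, method = "complete"), k = 4)
grp_average  <- cutree(hclust(d, method = "average"),  k = 4)

# Rows: complete-linkage clusters; columns: average-linkage clusters.
xtab <- table(complete = grp_complete, average = grp_average)
print(xtab)

# For each complete-linkage cluster, the largest share that ends up
# in a single average-linkage cluster (1 means the cluster is preserved intact).
round(apply(prop.table(xtab, 1), 1, max), 2)
```

Observations in off-diagonal cells with small counts correspond to the lines that cross the tanglegram, so the table gives a quick way to list them.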
dl <- dendlist(
dend1 %>%
set("labels_col", k=6) %>%
set("branches_lty", 1) %>%
set("branches_k_color", k = 6),
dend2 %>%
set("labels_col", k=6) %>%
set("branches_lty", 1) %>%
set("branches_k_color", k = 6)
)
# Plot them together
tanglegram(dl,
common_subtrees_color_lines = TRUE, highlight_distinct_edges = TRUE, highlight_branches_lwd=FALSE,
margin_inner=7,
lwd=2,
show_labels = FALSE
)
title("Complete Linkage vs Average")
dl <- dendlist(
dend1 %>%
set("labels_col", k=6) %>%
set("branches_lty", 1) %>%
set("branches_k_color", k = 6),
dend3 %>%
set("labels_col", k=5) %>%
set("branches_lty", 1) %>%
set("branches_k_color", k = 5)
)
# Plot them together
tanglegram(dl,
common_subtrees_color_lines = TRUE, highlight_distinct_edges = TRUE, highlight_branches_lwd=FALSE,
margin_inner=7,
lwd=2,
show_labels = FALSE
)
title("Complete Linkage vs Ward Linkage")
Principal Component Analysis (PCA) is a dimensionality reduction technique where a dataset is transformed to use p eigenvectors of the covariance matrix instead of the original n predictors, where p < n. The number of eigenvectors p is selected by sorting the eigenvalues and choosing a threshold for the percentage of variance explained.
The method seeks to project the data into a lower dimensional space where each axis (or principal component) captures the most variability in the data subject to the condition of being uncorrelated to the other axes. This last condition is important for dimensionality reduction in the sense that large datasets can contain many correlated variables which hold no additional information.
An eigenvalue greater than 1 indicates that a PC accounts for more variance than any single original variable does in standardized data. This is commonly used as a cutoff point for deciding which PCs are retained, and it holds only when the data are standardized. We can also limit the number of components to the smallest number that accounts for a certain fraction of the total variance, for example 70%.
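Both cutoff rules can be computed directly from a `prcomp` fit. The sketch below uses simulated data with three correlated and two independent variables. The variable names and the 70% threshold are illustrative assumptions.

```r
set.seed(5)
n <- 200
f <- rnorm(n)  # shared latent factor driving x1..x3

toy <- data.frame(
  x1 = f + rnorm(n, sd = 0.5),
  x2 = f + rnorm(n, sd = 0.5),
  x3 = f + rnorm(n, sd = 0.5),
  x4 = rnorm(n),
  x5 = rnorm(n)
)

pca <- prcomp(toy, scale. = TRUE)   # standardize so eigenvalues are comparable to 1
eig <- pca$sdev^2                   # eigenvalues of the correlation matrix

var_share <- eig / sum(eig)
cum_share <- cumsum(var_share)
print(round(data.frame(eigenvalue = eig, var_share, cum_share), 3))

n_kaiser <- sum(eig > 1)                  # eigenvalue-greater-than-1 rule
n_70     <- which(cum_share >= 0.70)[1]   # smallest number of PCs reaching 70%
c(kaiser = n_kaiser, seventy_percent = n_70)
```

With standardized data the eigenvalues sum to the number of variables, which is why 1 acts as the "average variable" benchmark.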
This section focuses on the subset of the data containing only the individual responses to the ADHD questionnaire. The table below displays the first 10 eigenvalues obtained from the decomposition. Following from the cutoff description above, our selection of dimensions can be based on the number of scaled eigenvalues that are greater than 1 or on a certain percentage of cumulative variance explained. Another way to select the number of PCs to consider is to study the scree plot provided below, which is simply a visual representation of the variance explained by each component. We typically look for an elbow in the plot to make our selection. In our case, the scree plot elbow occurs at Dimensions = 2 and the eigenvalue threshold at Dimensions = 3, which account for 59.2% and 64.8% of the variance respectively.
We can study the individual contributions of each questionnaire response to the principal components using the plot below. We observe that the contributions of the variables to the first principal component (Dim. 1) are roughly equal. However, Q16 bears a lot of weight in the second dimension, as Q5 does in the third.
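For readers who want to see what a "contribution" is numerically, the sketch below computes it from a `prcomp` fit as the squared loading expressed in percent, so each component's contributions sum to 100. This mirrors, as far as I understand it, the definition used by factoextra. The eigenvalue-weighted average across several components is likewise an assumption about how a multi-axis summary can be formed. The data are simulated questionnaire-style responses.

```r
set.seed(6)

# Simulated responses on a 0-4 scale for five made-up questions.
toy <- as.data.frame(matrix(sample(0:4, 5 * 100, replace = TRUE), ncol = 5))
names(toy) <- paste0("Q", 1:5)

pca <- prcomp(toy, scale. = FALSE)  # same scale for all items, so left unscaled

# Contribution (%) of each variable to each PC: squared loading times 100.
# Loadings are unit-length per component, so every column sums to 100.
contrib <- 100 * pca$rotation^2
print(round(contrib[, 1:3], 1))

# Overall contribution across PC1-PC3, weighting each PC by its eigenvalue (assumed scheme).
eig     <- pca$sdev^2
overall <- as.vector(contrib[, 1:3] %*% eig[1:3]) / sum(eig[1:3])
names(overall) <- rownames(contrib)
round(sort(overall, decreasing = TRUE), 1)
```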
res.pca.adhd <- adhd.full %>% select(ADHD.Q1:ADHD.Q18) %>% prcomp(scale = FALSE)
corrplot(get_pca_var(res.pca.adhd)$contrib, is.corr=FALSE)
The cumulative contribution across the first 3 dimensions is summarized below. Q5 bears the most overall importance, followed by Q16 and Q4. It is worth diving into the actual questionnaire to look up what each question is asking to see if any insights can be drawn to explain the variability. The questions are listed below.
adhd.full %>% select(ADHD.Q1:ADHD.Q18) %>%
prcomp(scale = FALSE) %>%
fviz_contrib(choice = "var", axes = 1:3, top = 10)
We can visualize these contributions in 2 dimensions using the first two PCs as shown below. On this plot, we look at the groupings and directions of the vectors. Positively correlated variables are grouped together. Negatively correlated variables are positioned on opposite sides of the plot origin. The distance between variables and the origin measures the contribution of the variables. We make the following observations:
fviz_pca_var(res.pca.adhd,
col.var = "contrib", # Color by contributions to the PC
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping
)
Using the reduced dataset, we can use PCA to extract additional information about the patients. Since we are using total scores for this section, the variables must be scaled in order to avoid any overweighted contributions. We find that it takes over 8 dimensions to explain approximately 70% of the variance. The scree plot has no obvious elbow which can be used to determine a cutoff. Using the eigenvalue cutoff at Dimension 7, we can still capture 65% of the variance.
From the correlation and variable contribution plots below, we can make a few observations:
res.pca <- adhd.red.complete %>% prcomp(scale = TRUE)
corrplot(get_pca_var(res.pca)$contrib, is.corr=FALSE)
Cumulatively for PC1 to PC7, we can observe below that SubstDX is the largest individual contributor to the principal components, followed by Age, Race and the two ADHD and Mood Disorder questionnaires.
The 2D representation of the variable contributions to PC1 and PC2 is shown below. SubstDX has the largest contribution in the direction of the PC2 axis. In the same direction we can identify lesser but nevertheless present contributions from variables such as Violence, Conduct, Cocaine, THC and Alcohol.
fviz_pca_var(res.pca,
col.var = "contrib", # Color by contributions to the PC
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping
)
fviz_pca_biplot(res.pca,
col.var = "contrib", # Color by contributions to the PC
gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
repel = TRUE # Avoid text overlapping
)
adhd.complete2 <- adhd.red.complete
adhd.complete2$Suicide <- as.factor(adhd.complete2$Suicide)
set.seed(55)
trainIndex <- createDataPartition(adhd.complete2$Suicide, p = .8, list = FALSE, times = 1)
adhd.training <- adhd.complete2[ trainIndex,]
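adhd.testing <- adhd.complete2[-trainIndex,]
# Grid search over kernel, cost and epsilon using tune()'s default 10-fold cross-validation
# Note: epsilon only applies to regression SVMs, so it does not affect this classification task
svm_m <- tune(svm, Suicide ~., data = adhd.training, ranges=list(
kernel=c("linear", "polynomial", "radial", "sigmoid"),
cost=2^(2:8),
epsilon = seq(0,1,0.1)))
summary(svm_m)
##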
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## kernel cost epsilon
## radial 4 0
##
## - best performance: 0.2727273
##
## - Detailed performance results:
## kernel cost epsilon error dispersion
## 1 linear 4 0.0 0.3325758 0.15719367
## 2 polynomial 4 0.0 0.3068182 0.08035304
## 3 radial 4 0.0 0.2727273 0.13184624
## 4 sigmoid 4 0.0 0.3522727 0.14142541
## 5 linear 8 0.0 0.3318182 0.15475493
## 6 polynomial 8 0.0 0.3159091 0.09181993
## 7 radial 8 0.0 0.2977273 0.11736585
## 8 sigmoid 8 0.0 0.3772727 0.15569979
## 9 linear 16 0.0 0.3318182 0.15475493
## 10 polynomial 16 0.0 0.3068182 0.12768142
## 11 radial 16 0.0 0.3333333 0.11741745
## 12 sigmoid 16 0.0 0.3765152 0.14436186
## 13 linear 32 0.0 0.3318182 0.15475493
## 14 polynomial 32 0.0 0.3333333 0.12761898
## 15 radial 32 0.0 0.2984848 0.15419350
## 16 sigmoid 32 0.0 0.3689394 0.17784843
## 17 linear 64 0.0 0.3318182 0.15475493
## 18 polynomial 64 0.0 0.3250000 0.09964033
## 19 radial 64 0.0 0.2984848 0.14712495
## 20 sigmoid 64 0.0 0.3924242 0.13047518
## 21 linear 128 0.0 0.3318182 0.15475493
## 22 polynomial 128 0.0 0.3250000 0.09964033
## 23 radial 128 0.0 0.3166667 0.14225938
## 24 sigmoid 128 0.0 0.3333333 0.11065102
## 25 linear 256 0.0 0.3318182 0.15475493
## 26 polynomial 256 0.0 0.3250000 0.09964033
## 27 radial 256 0.0 0.3166667 0.14225938
## 28 sigmoid 256 0.0 0.3954545 0.11609572
## 29 linear 4 0.1 0.3325758 0.15719367
## 30 polynomial 4 0.1 0.3068182 0.08035304
## 31 radial 4 0.1 0.2727273 0.13184624
## 32 sigmoid 4 0.1 0.3522727 0.14142541
## 33 linear 8 0.1 0.3318182 0.15475493
## 34 polynomial 8 0.1 0.3159091 0.09181993
## 35 radial 8 0.1 0.2977273 0.11736585
## 36 sigmoid 8 0.1 0.3772727 0.15569979
## 37 linear 16 0.1 0.3318182 0.15475493
## 38 polynomial 16 0.1 0.3068182 0.12768142
## 39 radial 16 0.1 0.3333333 0.11741745
## 40 sigmoid 16 0.1 0.3765152 0.14436186
## 41 linear 32 0.1 0.3318182 0.15475493
## 42 polynomial 32 0.1 0.3333333 0.12761898
## 43 radial 32 0.1 0.2984848 0.15419350
## 44 sigmoid 32 0.1 0.3689394 0.17784843
## 45 linear 64 0.1 0.3318182 0.15475493
## 46 polynomial 64 0.1 0.3250000 0.09964033
## 47 radial 64 0.1 0.2984848 0.14712495
## 48 sigmoid 64 0.1 0.3924242 0.13047518
## 49 linear 128 0.1 0.3318182 0.15475493
## 50 polynomial 128 0.1 0.3250000 0.09964033
## 51 radial 128 0.1 0.3166667 0.14225938
## 52 sigmoid 128 0.1 0.3333333 0.11065102
## 53 linear 256 0.1 0.3318182 0.15475493
## 54 polynomial 256 0.1 0.3250000 0.09964033
## 55 radial 256 0.1 0.3166667 0.14225938
## 56 sigmoid 256 0.1 0.3954545 0.11609572
## 57 linear 4 0.2 0.3325758 0.15719367
## 58 polynomial 4 0.2 0.3068182 0.08035304
## 59 radial 4 0.2 0.2727273 0.13184624
## 60 sigmoid 4 0.2 0.3522727 0.14142541
## 61 linear 8 0.2 0.3318182 0.15475493
## 62 polynomial 8 0.2 0.3159091 0.09181993
## 63 radial 8 0.2 0.2977273 0.11736585
## 64 sigmoid 8 0.2 0.3772727 0.15569979
## 65 linear 16 0.2 0.3318182 0.15475493
## 66 polynomial 16 0.2 0.3068182 0.12768142
## 67 radial 16 0.2 0.3333333 0.11741745
## 68 sigmoid 16 0.2 0.3765152 0.14436186
## 69 linear 32 0.2 0.3318182 0.15475493
## 70 polynomial 32 0.2 0.3333333 0.12761898
## 71 radial 32 0.2 0.2984848 0.15419350
## 72 sigmoid 32 0.2 0.3689394 0.17784843
## 73 linear 64 0.2 0.3318182 0.15475493
## 74 polynomial 64 0.2 0.3250000 0.09964033
## 75 radial 64 0.2 0.2984848 0.14712495
## 76 sigmoid 64 0.2 0.3924242 0.13047518
## 77 linear 128 0.2 0.3318182 0.15475493
## 78 polynomial 128 0.2 0.3250000 0.09964033
## 79 radial 128 0.2 0.3166667 0.14225938
## 80 sigmoid 128 0.2 0.3333333 0.11065102
## 81 linear 256 0.2 0.3318182 0.15475493
## 82 polynomial 256 0.2 0.3250000 0.09964033
## 83 radial 256 0.2 0.3166667 0.14225938
## 84 sigmoid 256 0.2 0.3954545 0.11609572
## 85 linear 4 0.3 0.3325758 0.15719367
## 86 polynomial 4 0.3 0.3068182 0.08035304
## 87 radial 4 0.3 0.2727273 0.13184624
## 88 sigmoid 4 0.3 0.3522727 0.14142541
## 89 linear 8 0.3 0.3318182 0.15475493
## 90 polynomial 8 0.3 0.3159091 0.09181993
## 91 radial 8 0.3 0.2977273 0.11736585
## 92 sigmoid 8 0.3 0.3772727 0.15569979
## 93 linear 16 0.3 0.3318182 0.15475493
## 94 polynomial 16 0.3 0.3068182 0.12768142
## 95 radial 16 0.3 0.3333333 0.11741745
## 96 sigmoid 16 0.3 0.3765152 0.14436186
## 97 linear 32 0.3 0.3318182 0.15475493
## 98 polynomial 32 0.3 0.3333333 0.12761898
## 99 radial 32 0.3 0.2984848 0.15419350
## 100 sigmoid 32 0.3 0.3689394 0.17784843
## 101 linear 64 0.3 0.3318182 0.15475493
## 102 polynomial 64 0.3 0.3250000 0.09964033
## 103 radial 64 0.3 0.2984848 0.14712495
## 104 sigmoid 64 0.3 0.3924242 0.13047518
## 105 linear 128 0.3 0.3318182 0.15475493
## 106 polynomial 128 0.3 0.3250000 0.09964033
## 107 radial 128 0.3 0.3166667 0.14225938
## 108 sigmoid 128 0.3 0.3333333 0.11065102
## 109 linear 256 0.3 0.3318182 0.15475493
## 110 polynomial 256 0.3 0.3250000 0.09964033
## 111 radial 256 0.3 0.3166667 0.14225938
## 112 sigmoid 256 0.3 0.3954545 0.11609572
## 113 linear 4 0.4 0.3325758 0.15719367
## 114 polynomial 4 0.4 0.3068182 0.08035304
## 115 radial 4 0.4 0.2727273 0.13184624
## 116 sigmoid 4 0.4 0.3522727 0.14142541
## 117 linear 8 0.4 0.3318182 0.15475493
## 118 polynomial 8 0.4 0.3159091 0.09181993
## 119 radial 8 0.4 0.2977273 0.11736585
## 120 sigmoid 8 0.4 0.3772727 0.15569979
## 121 linear 16 0.4 0.3318182 0.15475493
## 122 polynomial 16 0.4 0.3068182 0.12768142
## 123 radial 16 0.4 0.3333333 0.11741745
## 124 sigmoid 16 0.4 0.3765152 0.14436186
## 125 linear 32 0.4 0.3318182 0.15475493
## 126 polynomial 32 0.4 0.3333333 0.12761898
## 127 radial 32 0.4 0.2984848 0.15419350
## 128 sigmoid 32 0.4 0.3689394 0.17784843
## 129 linear 64 0.4 0.3318182 0.15475493
## 130 polynomial 64 0.4 0.3250000 0.09964033
## 131 radial 64 0.4 0.2984848 0.14712495
## 132 sigmoid 64 0.4 0.3924242 0.13047518
## 133 linear 128 0.4 0.3318182 0.15475493
## 134 polynomial 128 0.4 0.3250000 0.09964033
## 135 radial 128 0.4 0.3166667 0.14225938
## 136 sigmoid 128 0.4 0.3333333 0.11065102
## 137 linear 256 0.4 0.3318182 0.15475493
## 138 polynomial 256 0.4 0.3250000 0.09964033
## 139 radial 256 0.4 0.3166667 0.14225938
## 140 sigmoid 256 0.4 0.3954545 0.11609572
## 141 linear 4 0.5 0.3325758 0.15719367
## 142 polynomial 4 0.5 0.3068182 0.08035304
## 143 radial 4 0.5 0.2727273 0.13184624
## 144 sigmoid 4 0.5 0.3522727 0.14142541
## 145 linear 8 0.5 0.3318182 0.15475493
## 146 polynomial 8 0.5 0.3159091 0.09181993
## 147 radial 8 0.5 0.2977273 0.11736585
## 148 sigmoid 8 0.5 0.3772727 0.15569979
## 149 linear 16 0.5 0.3318182 0.15475493
## 150 polynomial 16 0.5 0.3068182 0.12768142
## 151 radial 16 0.5 0.3333333 0.11741745
## 152 sigmoid 16 0.5 0.3765152 0.14436186
## 153 linear 32 0.5 0.3318182 0.15475493
## 154 polynomial 32 0.5 0.3333333 0.12761898
## 155 radial 32 0.5 0.2984848 0.15419350
## 156 sigmoid 32 0.5 0.3689394 0.17784843
## 157 linear 64 0.5 0.3318182 0.15475493
## 158 polynomial 64 0.5 0.3250000 0.09964033
## 159 radial 64 0.5 0.2984848 0.14712495
## 160 sigmoid 64 0.5 0.3924242 0.13047518
## 161 linear 128 0.5 0.3318182 0.15475493
## 162 polynomial 128 0.5 0.3250000 0.09964033
## 163 radial 128 0.5 0.3166667 0.14225938
## 164 sigmoid 128 0.5 0.3333333 0.11065102
## 165 linear 256 0.5 0.3318182 0.15475493
## 166 polynomial 256 0.5 0.3250000 0.09964033
## 167 radial 256 0.5 0.3166667 0.14225938
## 168 sigmoid 256 0.5 0.3954545 0.11609572
## 169 linear 4 0.6 0.3325758 0.15719367
## 170 polynomial 4 0.6 0.3068182 0.08035304
## 171 radial 4 0.6 0.2727273 0.13184624
## 172 sigmoid 4 0.6 0.3522727 0.14142541
## 173 linear 8 0.6 0.3318182 0.15475493
## 174 polynomial 8 0.6 0.3159091 0.09181993
## 175 radial 8 0.6 0.2977273 0.11736585
## 176 sigmoid 8 0.6 0.3772727 0.15569979
## 177 linear 16 0.6 0.3318182 0.15475493
## 178 polynomial 16 0.6 0.3068182 0.12768142
## 179 radial 16 0.6 0.3333333 0.11741745
## 180 sigmoid 16 0.6 0.3765152 0.14436186
## 181 linear 32 0.6 0.3318182 0.15475493
## 182 polynomial 32 0.6 0.3333333 0.12761898
## 183 radial 32 0.6 0.2984848 0.15419350
## 184 sigmoid 32 0.6 0.3689394 0.17784843
## 185 linear 64 0.6 0.3318182 0.15475493
## 186 polynomial 64 0.6 0.3250000 0.09964033
## 187 radial 64 0.6 0.2984848 0.14712495
## 188 sigmoid 64 0.6 0.3924242 0.13047518
## 189 linear 128 0.6 0.3318182 0.15475493
## 190 polynomial 128 0.6 0.3250000 0.09964033
## 191 radial 128 0.6 0.3166667 0.14225938
## 192 sigmoid 128 0.6 0.3333333 0.11065102
## 193 linear 256 0.6 0.3318182 0.15475493
## 194 polynomial 256 0.6 0.3250000 0.09964033
## 195 radial 256 0.6 0.3166667 0.14225938
## 196 sigmoid 256 0.6 0.3954545 0.11609572
## 197 linear 4 0.7 0.3325758 0.15719367
## 198 polynomial 4 0.7 0.3068182 0.08035304
## 199 radial 4 0.7 0.2727273 0.13184624
## 200 sigmoid 4 0.7 0.3522727 0.14142541
## 201 linear 8 0.7 0.3318182 0.15475493
## 202 polynomial 8 0.7 0.3159091 0.09181993
## 203 radial 8 0.7 0.2977273 0.11736585
## 204 sigmoid 8 0.7 0.3772727 0.15569979
## 205 linear 16 0.7 0.3318182 0.15475493
## 206 polynomial 16 0.7 0.3068182 0.12768142
## 207 radial 16 0.7 0.3333333 0.11741745
## 208 sigmoid 16 0.7 0.3765152 0.14436186
## 209 linear 32 0.7 0.3318182 0.15475493
## 210 polynomial 32 0.7 0.3333333 0.12761898
## 211 radial 32 0.7 0.2984848 0.15419350
## 212 sigmoid 32 0.7 0.3689394 0.17784843
## 213 linear 64 0.7 0.3318182 0.15475493
## 214 polynomial 64 0.7 0.3250000 0.09964033
## 215 radial 64 0.7 0.2984848 0.14712495
## 216 sigmoid 64 0.7 0.3924242 0.13047518
## 217 linear 128 0.7 0.3318182 0.15475493
## 218 polynomial 128 0.7 0.3250000 0.09964033
## 219 radial 128 0.7 0.3166667 0.14225938
## 220 sigmoid 128 0.7 0.3333333 0.11065102
## 221 linear 256 0.7 0.3318182 0.15475493
## 222 polynomial 256 0.7 0.3250000 0.09964033
## 223 radial 256 0.7 0.3166667 0.14225938
## 224 sigmoid 256 0.7 0.3954545 0.11609572
## 225 linear 4 0.8 0.3325758 0.15719367
## 226 polynomial 4 0.8 0.3068182 0.08035304
## 227 radial 4 0.8 0.2727273 0.13184624
## 228 sigmoid 4 0.8 0.3522727 0.14142541
## 229 linear 8 0.8 0.3318182 0.15475493
## 230 polynomial 8 0.8 0.3159091 0.09181993
## 231 radial 8 0.8 0.2977273 0.11736585
## 232 sigmoid 8 0.8 0.3772727 0.15569979
## 233 linear 16 0.8 0.3318182 0.15475493
## 234 polynomial 16 0.8 0.3068182 0.12768142
## 235 radial 16 0.8 0.3333333 0.11741745
## 236 sigmoid 16 0.8 0.3765152 0.14436186
## 237 linear 32 0.8 0.3318182 0.15475493
## 238 polynomial 32 0.8 0.3333333 0.12761898
## 239 radial 32 0.8 0.2984848 0.15419350
## 240 sigmoid 32 0.8 0.3689394 0.17784843
## 241 linear 64 0.8 0.3318182 0.15475493
## 242 polynomial 64 0.8 0.3250000 0.09964033
## 243 radial 64 0.8 0.2984848 0.14712495
## 244 sigmoid 64 0.8 0.3924242 0.13047518
## 245 linear 128 0.8 0.3318182 0.15475493
## 246 polynomial 128 0.8 0.3250000 0.09964033
## 247 radial 128 0.8 0.3166667 0.14225938
## 248 sigmoid 128 0.8 0.3333333 0.11065102
## 249 linear 256 0.8 0.3318182 0.15475493
## 250 polynomial 256 0.8 0.3250000 0.09964033
## 251 radial 256 0.8 0.3166667 0.14225938
## 252 sigmoid 256 0.8 0.3954545 0.11609572
## 253 linear 4 0.9 0.3325758 0.15719367
## 254 polynomial 4 0.9 0.3068182 0.08035304
## 255 radial 4 0.9 0.2727273 0.13184624
## 256 sigmoid 4 0.9 0.3522727 0.14142541
## 257 linear 8 0.9 0.3318182 0.15475493
## 258 polynomial 8 0.9 0.3159091 0.09181993
## 259 radial 8 0.9 0.2977273 0.11736585
## 260 sigmoid 8 0.9 0.3772727 0.15569979
## 261 linear 16 0.9 0.3318182 0.15475493
## 262 polynomial 16 0.9 0.3068182 0.12768142
## 263 radial 16 0.9 0.3333333 0.11741745
## 264 sigmoid 16 0.9 0.3765152 0.14436186
## 265 linear 32 0.9 0.3318182 0.15475493
## 266 polynomial 32 0.9 0.3333333 0.12761898
## 267 radial 32 0.9 0.2984848 0.15419350
## 268 sigmoid 32 0.9 0.3689394 0.17784843
## 269 linear 64 0.9 0.3318182 0.15475493
## 270 polynomial 64 0.9 0.3250000 0.09964033
## 271 radial 64 0.9 0.2984848 0.14712495
## 272 sigmoid 64 0.9 0.3924242 0.13047518
## 273 linear 128 0.9 0.3318182 0.15475493
## 274 polynomial 128 0.9 0.3250000 0.09964033
## 275 radial 128 0.9 0.3166667 0.14225938
## 276 sigmoid 128 0.9 0.3333333 0.11065102
## 277 linear 256 0.9 0.3318182 0.15475493
## 278 polynomial 256 0.9 0.3250000 0.09964033
## 279 radial 256 0.9 0.3166667 0.14225938
## 280 sigmoid 256 0.9 0.3954545 0.11609572
## 281 linear 4 1.0 0.3325758 0.15719367
## 282 polynomial 4 1.0 0.3068182 0.08035304
## 283 radial 4 1.0 0.2727273 0.13184624
## 284 sigmoid 4 1.0 0.3522727 0.14142541
## 285 linear 8 1.0 0.3318182 0.15475493
## 286 polynomial 8 1.0 0.3159091 0.09181993
## 287 radial 8 1.0 0.2977273 0.11736585
## 288 sigmoid 8 1.0 0.3772727 0.15569979
## 289 linear 16 1.0 0.3318182 0.15475493
## 290 polynomial 16 1.0 0.3068182 0.12768142
## 291 radial 16 1.0 0.3333333 0.11741745
## 292 sigmoid 16 1.0 0.3765152 0.14436186
## 293 linear 32 1.0 0.3318182 0.15475493
## 294 polynomial 32 1.0 0.3333333 0.12761898
## 295 radial 32 1.0 0.2984848 0.15419350
## 296 sigmoid 32 1.0 0.3689394 0.17784843
## 297 linear 64 1.0 0.3318182 0.15475493
## 298 polynomial 64 1.0 0.3250000 0.09964033
## 299 radial 64 1.0 0.2984848 0.14712495
## 300 sigmoid 64 1.0 0.3924242 0.13047518
## 301 linear 128 1.0 0.3318182 0.15475493
## 302 polynomial 128 1.0 0.3250000 0.09964033
## 303 radial 128 1.0 0.3166667 0.14225938
## 304 sigmoid 128 1.0 0.3333333 0.11065102
## 305 linear 256 1.0 0.3318182 0.15475493
## 306 polynomial 256 1.0 0.3250000 0.09964033
## 307 radial 256 1.0 0.3166667 0.14225938
## 308 sigmoid 256 1.0 0.3954545 0.11609572
svm_m_best <- svm_m$best.model
svm_pred <- predict(svm_m_best, newdata = adhd.testing, type="class")
svm_cm <- confusionMatrix(svm_pred, adhd.testing$Suicide)
svm_cm$table
## Reference
## Prediction 0 1
## 0 15 7
## 1 4 2
## [1] 0.6071429
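The accuracy above follows directly from the confusion-matrix counts. As a minimal sketch in base R (the counts are copied by hand from the table above, and the object names are illustrative rather than taken from the original analysis), the same figure can be recomputed alongside the recall for class 1 and the majority-class baseline:

```r
# Counts copied by hand from the confusion matrix above
# (rows = Prediction, columns = Reference). Object names are illustrative.
cm <- matrix(c(15, 4, 7, 2), nrow = 2,
             dimnames = list(Prediction = c("0", "1"),
                             Reference  = c("0", "1")))

# Share of test cases on the diagonal (correct predictions)
accuracy <- sum(diag(cm)) / sum(cm)

# Share of true class-1 cases that the model labels as 1
recall_1 <- cm["1", "1"] / sum(cm[, "1"])

# Accuracy of always predicting the most frequent reference class
baseline <- max(colSums(cm)) / sum(cm)

round(c(accuracy = accuracy, recall_1 = recall_1, baseline = baseline), 4)
```

With these counts the accuracy is 17/28 (about 0.607), the baseline is 19/28 (about 0.679), and the recall for class 1 is 2/9 (about 0.222), so on this test split the model does not beat always predicting class 0.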
adhd_raw2 <- adhd_raw %>% select(-MD.TOTAL, -ADHD.Total, -Initial, -Psych.meds.)
adhd_raw2 <- adhd_raw2[complete.cases(adhd_raw2), ]
adhd_raw2$Suicide <- as.factor(adhd_raw2$Suicide)
set.seed(55)
trainIndex <- createDataPartition(adhd_raw2$Suicide, p = .8, list = FALSE, times = 1)
adhd_pca.training <- adhd_raw2[ trainIndex,]
adhd_pca.testing <- adhd_raw2[-trainIndex,]
svm_pca_m <- prcomp(select(adhd_pca.training, -Suicide), center = TRUE, scale. = TRUE)
summary(svm_pca_m)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 3.4453 2.2528 1.55906 1.37007 1.34635 1.28866 1.27728
## Proportion of Variance 0.2422 0.1036 0.04961 0.03831 0.03699 0.03389 0.03329
## Cumulative Proportion 0.2422 0.3458 0.39543 0.43374 0.47073 0.50462 0.53792
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 1.22380 1.19497 1.16886 1.08459 1.05487 1.04846 1.03414
## Proportion of Variance 0.03057 0.02914 0.02788 0.02401 0.02271 0.02243 0.02183
## Cumulative Proportion 0.56848 0.59762 0.62551 0.64951 0.67222 0.69466 0.71648
## PC15 PC16 PC17 PC18 PC19 PC20 PC21
## Standard deviation 1.00751 0.96060 0.95823 0.91873 0.87611 0.8400 0.8340
## Proportion of Variance 0.02072 0.01883 0.01874 0.01723 0.01566 0.0144 0.0142
## Cumulative Proportion 0.73720 0.75603 0.77477 0.79199 0.80766 0.8221 0.8363
## PC22 PC23 PC24 PC25 PC26 PC27 PC28
## Standard deviation 0.81062 0.78359 0.76540 0.69345 0.67529 0.64590 0.63526
## Proportion of Variance 0.01341 0.01253 0.01196 0.00981 0.00931 0.00851 0.00824
## Cumulative Proportion 0.84966 0.86219 0.87415 0.88396 0.89327 0.90178 0.91002
## PC29 PC30 PC31 PC32 PC33 PC34 PC35
## Standard deviation 0.62827 0.60068 0.59058 0.57195 0.56669 0.54155 0.52992
## Proportion of Variance 0.00806 0.00736 0.00712 0.00668 0.00655 0.00599 0.00573
## Cumulative Proportion 0.91808 0.92544 0.93256 0.93923 0.94579 0.95177 0.95750
## PC36 PC37 PC38 PC39 PC40 PC41 PC42
## Standard deviation 0.5050 0.48908 0.47771 0.44431 0.42168 0.39567 0.37971
## Proportion of Variance 0.0052 0.00488 0.00466 0.00403 0.00363 0.00319 0.00294
## Cumulative Proportion 0.9627 0.96759 0.97225 0.97628 0.97990 0.98310 0.98604
## PC43 PC44 PC45 PC46 PC47 PC48 PC49
## Standard deviation 0.36480 0.34923 0.32573 0.31662 0.2968 0.27565 0.24184
## Proportion of Variance 0.00272 0.00249 0.00217 0.00205 0.0018 0.00155 0.00119
## Cumulative Proportion 0.98876 0.99125 0.99341 0.99546 0.9973 0.99881 1.00000
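In the summary above, each component's eigenvalue is its standard deviation squared, so the components with a standard deviation above 1 are PC1 through PC15, which together account for about 73.7% of the variance. The sketch below illustrates that selection rule and the projection of held-out rows onto a rotation fitted on training rows only. It uses the built-in mtcars data as a stand-in (an assumption for illustration, not the ADHD data), and all object names are hypothetical.

```r
# Stand-in data: built-in mtcars, NOT the ADHD data set.
set.seed(1)
idx   <- sample(nrow(mtcars), 24)
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

# Fit PCA on the training rows only, with centering and scaling
pca <- prcomp(train, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components
eig <- pca$sdev^2
k   <- sum(eig > 1)   # number of components with eigenvalue > 1

# Training scores and test scores restricted to the first k components
train_scores <- pca$x[, seq_len(k), drop = FALSE]
test_scores  <- predict(pca, newdata = test)[, seq_len(k), drop = FALSE]

# predict() applies the training center/scale and rotation to the new rows
manual <- scale(test, center = pca$center, scale = pca$scale) %*% pca$rotation
same   <- isTRUE(all.equal(unname(manual[, seq_len(k), drop = FALSE]),
                           unname(test_scores)))
```

The point of the `manual` check is that the test rows are standardized with the training means and standard deviations, never their own, which keeps information from the test set out of the fitted rotation.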
# Check the PC1 scores for normality and outliers with a Q-Q plot.
# Scaling the original features standardizes their variance; it does not by itself make the scores normal.
qqnorm(svm_pca_m[["x"]][,1])
fviz_pca_ind(svm_pca_m, label="none",
habillage = adhd_pca.training$Suicide,
addEllipses = TRUE, palette = "jco")
# Kaiser rule: keep PCs with eigenvalues greater than 1
reduced_dim <- get_eigenvalue(svm_pca_m) %>% filter(eigenvalue > 1)
reduced_dim
adhd_pca.training_reduced <- cbind(as.data.frame(svm_pca_m$x[,c(1:nrow(reduced_dim))]), Suicide = adhd_pca.training$Suicide)
head(adhd_pca.training_reduced)
# Project the test data onto the training rotation with predict(),
# keeping only the components retained above (eigenvalue > 1)
adhd_pca.testing_reduced <- cbind(as.data.frame(predict(svm_pca_m, newdata = select(adhd_pca.testing, -Suicide))[,c(1:nrow(reduced_dim))]),Suicide = adhd_pca.testing$Suicide)
head(adhd_pca.testing_reduced)
# Note: this overwrites svm_pca_m (the prcomp object) with the tuning result
svm_pca_m <- tune(svm, Suicide ~., data = adhd_pca.training_reduced, ranges=list(
kernel=c("linear", "polynomial", "radial", "sigmoid"),
cost=2^(2:8),
epsilon = seq(0,1,0.1)))
summary(svm_pca_m)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## kernel cost epsilon
## sigmoid 4 0
##
## - best performance: 0.2628788
##
## - Detailed performance results:
## kernel cost epsilon error dispersion
## 1 linear 4 0.0 0.3318182 0.08396607
## 2 polynomial 4 0.0 0.2878788 0.09360481
## 3 radial 4 0.0 0.2825758 0.10406063
## 4 sigmoid 4 0.0 0.2628788 0.05680976
## 5 linear 8 0.0 0.3227273 0.07346706
## 6 polynomial 8 0.0 0.2962121 0.14924435
## 7 radial 8 0.0 0.2901515 0.13343517
## 8 sigmoid 8 0.0 0.2969697 0.14248333
## 9 linear 16 0.0 0.3227273 0.07346706
## 10 polynomial 16 0.0 0.3136364 0.13992725
## 11 radial 16 0.0 0.3159091 0.12821968
## 12 sigmoid 16 0.0 0.3242424 0.08890897
## 13 linear 32 0.0 0.3227273 0.07346706
## 14 polynomial 32 0.0 0.3393939 0.15586353
## 15 radial 32 0.0 0.3159091 0.12821968
## 16 sigmoid 32 0.0 0.3772727 0.09270119
## 17 linear 64 0.0 0.3227273 0.07346706
## 18 polynomial 64 0.0 0.3393939 0.15586353
## 19 radial 64 0.0 0.3159091 0.12821968
## 20 sigmoid 64 0.0 0.3568182 0.10881758
## 21 linear 128 0.0 0.3227273 0.07346706
## 22 polynomial 128 0.0 0.3393939 0.15586353
## 23 radial 128 0.0 0.3159091 0.12821968
## 24 sigmoid 128 0.0 0.3659091 0.13533328
## 25 linear 256 0.0 0.3227273 0.07346706
## 26 polynomial 256 0.0 0.3393939 0.15586353
## 27 radial 256 0.0 0.3159091 0.12821968
## 28 sigmoid 256 0.0 0.3431818 0.11638374
## 29 linear 4 0.1 0.3318182 0.08396607
## 30 polynomial 4 0.1 0.2878788 0.09360481
## 31 radial 4 0.1 0.2825758 0.10406063
## 32 sigmoid 4 0.1 0.2628788 0.05680976
## 33 linear 8 0.1 0.3227273 0.07346706
## 34 polynomial 8 0.1 0.2962121 0.14924435
## 35 radial 8 0.1 0.2901515 0.13343517
## 36 sigmoid 8 0.1 0.2969697 0.14248333
## 37 linear 16 0.1 0.3227273 0.07346706
## 38 polynomial 16 0.1 0.3136364 0.13992725
## 39 radial 16 0.1 0.3159091 0.12821968
## 40 sigmoid 16 0.1 0.3242424 0.08890897
## 41 linear 32 0.1 0.3227273 0.07346706
## 42 polynomial 32 0.1 0.3393939 0.15586353
## 43 radial 32 0.1 0.3159091 0.12821968
## 44 sigmoid 32 0.1 0.3772727 0.09270119
## 45 linear 64 0.1 0.3227273 0.07346706
## 46 polynomial 64 0.1 0.3393939 0.15586353
## 47 radial 64 0.1 0.3159091 0.12821968
## 48 sigmoid 64 0.1 0.3568182 0.10881758
## 49 linear 128 0.1 0.3227273 0.07346706
## 50 polynomial 128 0.1 0.3393939 0.15586353
## 51 radial 128 0.1 0.3159091 0.12821968
## 52 sigmoid 128 0.1 0.3659091 0.13533328
## 53 linear 256 0.1 0.3227273 0.07346706
## 54 polynomial 256 0.1 0.3393939 0.15586353
## 55 radial 256 0.1 0.3159091 0.12821968
## 56 sigmoid 256 0.1 0.3431818 0.11638374
## 57 linear 4 0.2 0.3318182 0.08396607
## 58 polynomial 4 0.2 0.2878788 0.09360481
## 59 radial 4 0.2 0.2825758 0.10406063
## 60 sigmoid 4 0.2 0.2628788 0.05680976
## 61 linear 8 0.2 0.3227273 0.07346706
## 62 polynomial 8 0.2 0.2962121 0.14924435
## 63 radial 8 0.2 0.2901515 0.13343517
## 64 sigmoid 8 0.2 0.2969697 0.14248333
## 65 linear 16 0.2 0.3227273 0.07346706
## 66 polynomial 16 0.2 0.3136364 0.13992725
## 67 radial 16 0.2 0.3159091 0.12821968
## 68 sigmoid 16 0.2 0.3242424 0.08890897
## 69 linear 32 0.2 0.3227273 0.07346706
## 70 polynomial 32 0.2 0.3393939 0.15586353
## 71 radial 32 0.2 0.3159091 0.12821968
## 72 sigmoid 32 0.2 0.3772727 0.09270119
## 73 linear 64 0.2 0.3227273 0.07346706
## 74 polynomial 64 0.2 0.3393939 0.15586353
## 75 radial 64 0.2 0.3159091 0.12821968
## 76 sigmoid 64 0.2 0.3568182 0.10881758
## 77 linear 128 0.2 0.3227273 0.07346706
## 78 polynomial 128 0.2 0.3393939 0.15586353
## 79 radial 128 0.2 0.3159091 0.12821968
## 80 sigmoid 128 0.2 0.3659091 0.13533328
## 81 linear 256 0.2 0.3227273 0.07346706
## 82 polynomial 256 0.2 0.3393939 0.15586353
## 83 radial 256 0.2 0.3159091 0.12821968
## 84 sigmoid 256 0.2 0.3431818 0.11638374
## 85 linear 4 0.3 0.3318182 0.08396607
## 86 polynomial 4 0.3 0.2878788 0.09360481
## 87 radial 4 0.3 0.2825758 0.10406063
## 88 sigmoid 4 0.3 0.2628788 0.05680976
## 89 linear 8 0.3 0.3227273 0.07346706
## 90 polynomial 8 0.3 0.2962121 0.14924435
## 91 radial 8 0.3 0.2901515 0.13343517
## 92 sigmoid 8 0.3 0.2969697 0.14248333
## 93 linear 16 0.3 0.3227273 0.07346706
## 94 polynomial 16 0.3 0.3136364 0.13992725
## 95 radial 16 0.3 0.3159091 0.12821968
## 96 sigmoid 16 0.3 0.3242424 0.08890897
## 97 linear 32 0.3 0.3227273 0.07346706
## 98 polynomial 32 0.3 0.3393939 0.15586353
## 99 radial 32 0.3 0.3159091 0.12821968
## 100 sigmoid 32 0.3 0.3772727 0.09270119
## 101 linear 64 0.3 0.3227273 0.07346706
## 102 polynomial 64 0.3 0.3393939 0.15586353
## 103 radial 64 0.3 0.3159091 0.12821968
## 104 sigmoid 64 0.3 0.3568182 0.10881758
## 105 linear 128 0.3 0.3227273 0.07346706
## 106 polynomial 128 0.3 0.3393939 0.15586353
## 107 radial 128 0.3 0.3159091 0.12821968
## 108 sigmoid 128 0.3 0.3659091 0.13533328
## 109 linear 256 0.3 0.3227273 0.07346706
## 110 polynomial 256 0.3 0.3393939 0.15586353
## 111 radial 256 0.3 0.3159091 0.12821968
## 112 sigmoid 256 0.3 0.3431818 0.11638374
## 113 linear 4 0.4 0.3318182 0.08396607
## 114 polynomial 4 0.4 0.2878788 0.09360481
## 115 radial 4 0.4 0.2825758 0.10406063
## 116 sigmoid 4 0.4 0.2628788 0.05680976
## 117 linear 8 0.4 0.3227273 0.07346706
## 118 polynomial 8 0.4 0.2962121 0.14924435
## 119 radial 8 0.4 0.2901515 0.13343517
## 120 sigmoid 8 0.4 0.2969697 0.14248333
## 121 linear 16 0.4 0.3227273 0.07346706
## 122 polynomial 16 0.4 0.3136364 0.13992725
## 123 radial 16 0.4 0.3159091 0.12821968
## 124 sigmoid 16 0.4 0.3242424 0.08890897
## 125 linear 32 0.4 0.3227273 0.07346706
## 126 polynomial 32 0.4 0.3393939 0.15586353
## 127 radial 32 0.4 0.3159091 0.12821968
## 128 sigmoid 32 0.4 0.3772727 0.09270119
## 129 linear 64 0.4 0.3227273 0.07346706
## 130 polynomial 64 0.4 0.3393939 0.15586353
## 131 radial 64 0.4 0.3159091 0.12821968
## 132 sigmoid 64 0.4 0.3568182 0.10881758
## 133 linear 128 0.4 0.3227273 0.07346706
## 134 polynomial 128 0.4 0.3393939 0.15586353
## 135 radial 128 0.4 0.3159091 0.12821968
## 136 sigmoid 128 0.4 0.3659091 0.13533328
## 137 linear 256 0.4 0.3227273 0.07346706
## 138 polynomial 256 0.4 0.3393939 0.15586353
## 139 radial 256 0.4 0.3159091 0.12821968
## 140 sigmoid 256 0.4 0.3431818 0.11638374
## 141 linear 4 0.5 0.3318182 0.08396607
## 142 polynomial 4 0.5 0.2878788 0.09360481
## 143 radial 4 0.5 0.2825758 0.10406063
## 144 sigmoid 4 0.5 0.2628788 0.05680976
## 145 linear 8 0.5 0.3227273 0.07346706
## 146 polynomial 8 0.5 0.2962121 0.14924435
## 147 radial 8 0.5 0.2901515 0.13343517
## 148 sigmoid 8 0.5 0.2969697 0.14248333
## 149 linear 16 0.5 0.3227273 0.07346706
## 150 polynomial 16 0.5 0.3136364 0.13992725
## 151 radial 16 0.5 0.3159091 0.12821968
## 152 sigmoid 16 0.5 0.3242424 0.08890897
## 153 linear 32 0.5 0.3227273 0.07346706
## 154 polynomial 32 0.5 0.3393939 0.15586353
## 155 radial 32 0.5 0.3159091 0.12821968
## 156 sigmoid 32 0.5 0.3772727 0.09270119
## 157 linear 64 0.5 0.3227273 0.07346706
## 158 polynomial 64 0.5 0.3393939 0.15586353
## 159 radial 64 0.5 0.3159091 0.12821968
## 160 sigmoid 64 0.5 0.3568182 0.10881758
## 161 linear 128 0.5 0.3227273 0.07346706
## 162 polynomial 128 0.5 0.3393939 0.15586353
## 163 radial 128 0.5 0.3159091 0.12821968
## 164 sigmoid 128 0.5 0.3659091 0.13533328
## 165 linear 256 0.5 0.3227273 0.07346706
## 166 polynomial 256 0.5 0.3393939 0.15586353
## 167 radial 256 0.5 0.3159091 0.12821968
## 168 sigmoid 256 0.5 0.3431818 0.11638374
## 169 linear 4 0.6 0.3318182 0.08396607
## 170 polynomial 4 0.6 0.2878788 0.09360481
## 171 radial 4 0.6 0.2825758 0.10406063
## 172 sigmoid 4 0.6 0.2628788 0.05680976
## 173 linear 8 0.6 0.3227273 0.07346706
## 174 polynomial 8 0.6 0.2962121 0.14924435
## 175 radial 8 0.6 0.2901515 0.13343517
## 176 sigmoid 8 0.6 0.2969697 0.14248333
## 177 linear 16 0.6 0.3227273 0.07346706
## 178 polynomial 16 0.6 0.3136364 0.13992725
## 179 radial 16 0.6 0.3159091 0.12821968
## 180 sigmoid 16 0.6 0.3242424 0.08890897
## 181 linear 32 0.6 0.3227273 0.07346706
## 182 polynomial 32 0.6 0.3393939 0.15586353
## 183 radial 32 0.6 0.3159091 0.12821968
## 184 sigmoid 32 0.6 0.3772727 0.09270119
## 185 linear 64 0.6 0.3227273 0.07346706
## 186 polynomial 64 0.6 0.3393939 0.15586353
## 187 radial 64 0.6 0.3159091 0.12821968
## 188 sigmoid 64 0.6 0.3568182 0.10881758
## 189 linear 128 0.6 0.3227273 0.07346706
## 190 polynomial 128 0.6 0.3393939 0.15586353
## 191 radial 128 0.6 0.3159091 0.12821968
## 192 sigmoid 128 0.6 0.3659091 0.13533328
## 193 linear 256 0.6 0.3227273 0.07346706
## 194 polynomial 256 0.6 0.3393939 0.15586353
## 195 radial 256 0.6 0.3159091 0.12821968
## 196 sigmoid 256 0.6 0.3431818 0.11638374
## 197 linear 4 0.7 0.3318182 0.08396607
## 198 polynomial 4 0.7 0.2878788 0.09360481
## 199 radial 4 0.7 0.2825758 0.10406063
## 200 sigmoid 4 0.7 0.2628788 0.05680976
## 201 linear 8 0.7 0.3227273 0.07346706
## 202 polynomial 8 0.7 0.2962121 0.14924435
## 203 radial 8 0.7 0.2901515 0.13343517
## 204 sigmoid 8 0.7 0.2969697 0.14248333
## 205 linear 16 0.7 0.3227273 0.07346706
## 206 polynomial 16 0.7 0.3136364 0.13992725
## 207 radial 16 0.7 0.3159091 0.12821968
## 208 sigmoid 16 0.7 0.3242424 0.08890897
## 209 linear 32 0.7 0.3227273 0.07346706
## 210 polynomial 32 0.7 0.3393939 0.15586353
## 211 radial 32 0.7 0.3159091 0.12821968
## 212 sigmoid 32 0.7 0.3772727 0.09270119
## 213 linear 64 0.7 0.3227273 0.07346706
## 214 polynomial 64 0.7 0.3393939 0.15586353
## 215 radial 64 0.7 0.3159091 0.12821968
## 216 sigmoid 64 0.7 0.3568182 0.10881758
## 217 linear 128 0.7 0.3227273 0.07346706
## 218 polynomial 128 0.7 0.3393939 0.15586353
## 219 radial 128 0.7 0.3159091 0.12821968
## 220 sigmoid 128 0.7 0.3659091 0.13533328
## 221 linear 256 0.7 0.3227273 0.07346706
## 222 polynomial 256 0.7 0.3393939 0.15586353
## 223 radial 256 0.7 0.3159091 0.12821968
## 224 sigmoid 256 0.7 0.3431818 0.11638374
## 225 linear 4 0.8 0.3318182 0.08396607
## 226 polynomial 4 0.8 0.2878788 0.09360481
## 227 radial 4 0.8 0.2825758 0.10406063
## 228 sigmoid 4 0.8 0.2628788 0.05680976
## 229 linear 8 0.8 0.3227273 0.07346706
## 230 polynomial 8 0.8 0.2962121 0.14924435
## 231 radial 8 0.8 0.2901515 0.13343517
## 232 sigmoid 8 0.8 0.2969697 0.14248333
## 233 linear 16 0.8 0.3227273 0.07346706
## 234 polynomial 16 0.8 0.3136364 0.13992725
## 235 radial 16 0.8 0.3159091 0.12821968
## 236 sigmoid 16 0.8 0.3242424 0.08890897
## 237 linear 32 0.8 0.3227273 0.07346706
## 238 polynomial 32 0.8 0.3393939 0.15586353
## 239 radial 32 0.8 0.3159091 0.12821968
## 240 sigmoid 32 0.8 0.3772727 0.09270119
## 241 linear 64 0.8 0.3227273 0.07346706
## 242 polynomial 64 0.8 0.3393939 0.15586353
## 243 radial 64 0.8 0.3159091 0.12821968
## 244 sigmoid 64 0.8 0.3568182 0.10881758
## 245 linear 128 0.8 0.3227273 0.07346706
## 246 polynomial 128 0.8 0.3393939 0.15586353
## 247 radial 128 0.8 0.3159091 0.12821968
## 248 sigmoid 128 0.8 0.3659091 0.13533328
## 249 linear 256 0.8 0.3227273 0.07346706
## 250 polynomial 256 0.8 0.3393939 0.15586353
## 251 radial 256 0.8 0.3159091 0.12821968
## 252 sigmoid 256 0.8 0.3431818 0.11638374
## 253 linear 4 0.9 0.3318182 0.08396607
## 254 polynomial 4 0.9 0.2878788 0.09360481
## 255 radial 4 0.9 0.2825758 0.10406063
## 256 sigmoid 4 0.9 0.2628788 0.05680976
## 257 linear 8 0.9 0.3227273 0.07346706
## 258 polynomial 8 0.9 0.2962121 0.14924435
## 259 radial 8 0.9 0.2901515 0.13343517
## 260 sigmoid 8 0.9 0.2969697 0.14248333
## 261 linear 16 0.9 0.3227273 0.07346706
## 262 polynomial 16 0.9 0.3136364 0.13992725
## 263 radial 16 0.9 0.3159091 0.12821968
## 264 sigmoid 16 0.9 0.3242424 0.08890897
## 265 linear 32 0.9 0.3227273 0.07346706
## 266 polynomial 32 0.9 0.3393939 0.15586353
## 267 radial 32 0.9 0.3159091 0.12821968
## 268 sigmoid 32 0.9 0.3772727 0.09270119
## 269 linear 64 0.9 0.3227273 0.07346706
## 270 polynomial 64 0.9 0.3393939 0.15586353
## 271 radial 64 0.9 0.3159091 0.12821968
## 272 sigmoid 64 0.9 0.3568182 0.10881758
## 273 linear 128 0.9 0.3227273 0.07346706
## 274 polynomial 128 0.9 0.3393939 0.15586353
## 275 radial 128 0.9 0.3159091 0.12821968
## 276 sigmoid 128 0.9 0.3659091 0.13533328
## 277 linear 256 0.9 0.3227273 0.07346706
## 278 polynomial 256 0.9 0.3393939 0.15586353
## 279 radial 256 0.9 0.3159091 0.12821968
## 280 sigmoid 256 0.9 0.3431818 0.11638374
## 281 linear 4 1.0 0.3318182 0.08396607
## 282 polynomial 4 1.0 0.2878788 0.09360481
## 283 radial 4 1.0 0.2825758 0.10406063
## 284 sigmoid 4 1.0 0.2628788 0.05680976
## 285 linear 8 1.0 0.3227273 0.07346706
## 286 polynomial 8 1.0 0.2962121 0.14924435
## 287 radial 8 1.0 0.2901515 0.13343517
## 288 sigmoid 8 1.0 0.2969697 0.14248333
## 289 linear 16 1.0 0.3227273 0.07346706
## 290 polynomial 16 1.0 0.3136364 0.13992725
## 291 radial 16 1.0 0.3159091 0.12821968
## 292 sigmoid 16 1.0 0.3242424 0.08890897
## 293 linear 32 1.0 0.3227273 0.07346706
## 294 polynomial 32 1.0 0.3393939 0.15586353
## 295 radial 32 1.0 0.3159091 0.12821968
## 296 sigmoid 32 1.0 0.3772727 0.09270119
## 297 linear 64 1.0 0.3227273 0.07346706
## 298 polynomial 64 1.0 0.3393939 0.15586353
## 299 radial 64 1.0 0.3159091 0.12821968
## 300 sigmoid 64 1.0 0.3568182 0.10881758
## 301 linear 128 1.0 0.3227273 0.07346706
## 302 polynomial 128 1.0 0.3393939 0.15586353
## 303 radial 128 1.0 0.3159091 0.12821968
## 304 sigmoid 128 1.0 0.3659091 0.13533328
## 305 linear 256 1.0 0.3227273 0.07346706
## 306 polynomial 256 1.0 0.3393939 0.15586353
## 307 radial 256 1.0 0.3159091 0.12821968
## 308 sigmoid 256 1.0 0.3431818 0.11638374
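The error and dispersion columns above repeat exactly for every value of epsilon. That is expected: in e1071's svm(), epsilon parameterizes the insensitive loss used for regression, so it has no effect when the response is a factor and the model is a classifier. As a rough sketch (base R only; the grid values are copied from the tune() call above, and the object names are illustrative), the 308 rows collapse to 28 distinct settings once epsilon is dropped:

```r
# Grid values copied from the tune() call above; object names are illustrative.
kernels <- c("linear", "polynomial", "radial", "sigmoid")
costs   <- 2^(2:8)
eps     <- seq(0, 1, 0.1)

# Grid as tuned: every kernel/cost pair is repeated once per epsilon value
full_grid <- expand.grid(kernel = kernels, cost = costs, epsilon = eps)

# Grid without epsilon: the settings that can actually differ for classification
reduced_grid <- expand.grid(kernel = kernels, cost = costs)

c(full = nrow(full_grid), reduced = nrow(reduced_grid))
```

Dropping epsilon from the ranges list would give the same best kernel and cost with roughly one eleventh of the cross-validation fits.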
svm_pca_m_best <- svm_pca_m$best.model
svm_pca_pred <- predict(svm_pca_m_best, newdata = adhd_pca.testing_reduced, type="class")
svm_pca_cm <- confusionMatrix(svm_pca_pred, adhd_pca.testing_reduced$Suicide)
svm_pca_cm$table
## Reference
## Prediction 0 1
## 0 14 7
## 1 5 2
## [1] 0.5714286